-
Notifications
You must be signed in to change notification settings - Fork 458
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document the 'SELECT ... FOR UPDATE' statement #6671
Conversation
Related issue to document KV layer changes: #6571 |
Nathan this PR is meant to document the user-facing changes related to Please take a look at your convenience and let me know if this is missing any pertinent info that SQL users should have re: SFU, or if anything is incorrect. Also, if you have suggestions re: how to better explain this feature, I'm happy to update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for publishing this Rich!
Do we no longer get previews for these changes?
Reviewed 12 of 12 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rmloveland)
_includes/v20.1/misc/session-vars.html, line 102 at r1 (raw file):
<code>enable_implicit_select_for_update</code> </td> <td><span class="version-tag">New in v20.1</span>: Indicates whether <a href="update.html"><code>UPDATE</code></a> statements acquire locks using the <code>FOR UPDATE</code> locking mode during their initial row scan, which improves performance for contended workloads. For more information about how <code>FOR UPDATE</code> locking works, see the documentation for <a href="select-for-update.html"><code>SELECT FOR UPDATE</code></a>.</td>
This may also include DELETE and UPSERT statements by the 20.1 release. I'm not sure whether you want to mention that now or not.
_includes/v20.1/sql/select-for-update-overview.md, line 5 at r1 (raw file):
Benefits of using `SELECT FOR UPDATE` include: - Reduced [transaction contention][transaction_contention] when your workload requires concurrent access to the same rows.
Transaction contention is not reduced by SELECT FOR UPDATE. In fact, it's increased.
_includes/v20.1/sql/select-for-update-overview.md, line 7 at r1 (raw file):
- Reduced [transaction contention][transaction_contention] when your workload requires concurrent access to the same rows. - A substantial reduction the number of [transaction retries][retries] in scenarios where a transaction performs a read and then updates the same row it just read. - Increased throughput.
We should qualify this further. Neither of these statements is true for uncontended workloads. I think we should remove this list and expand upon the second point more, as it's the one that gets at the heart of why users would want to use SELECT FOR UPDATE.
SELECT FOR UPDATE
allows users to acquire pessimistic locks during reads so that they can't be modified between the time that they are read and the time that they are updated. This prevents retries, which occur when operations interleave like this. This also prevents thrashing during these read-modify-write attempts because transactions are forced to queue up during the read operation instead of during the write operation.
All of this is what leads to increased throughput and decreased tail latency for contended operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we no longer get previews
AIUI the security team changed perms on our S3 buckets recently, which affected docs builds. According to some discussion in the #education channel, this should be fixed now.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
_includes/v20.1/misc/session-vars.html, line 102 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This may also include DELETE and UPSERT statements by the 20.1 release. I'm not sure whether you want to mention that now or not.
For now I have filed #6699 to track adding DELETE and UPSERT here later as needed.
_includes/v20.1/sql/select-for-update-overview.md, line 5 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Transaction contention is not reduced by SELECT FOR UPDATE. In fact, it's increased.
Interesting. Is that because the read-based queueing you describe below causes more transactions to "stack up" behind the one that holds the lock?
_includes/v20.1/sql/select-for-update-overview.md, line 7 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We should qualify this further. Neither of these statements is true for uncontended workloads. I think we should remove this list and expand upon the second point more, as it's the one that gets at the heart of why users would want to use SELECT FOR UPDATE.
SELECT FOR UPDATE
allows users to acquire pessimistic locks during reads so that they can't be modified between the time that they are read and the time that they are updated. This prevents retries, which occur when operations interleave like this. This also prevents thrashing during these read-modify-write attempts because transactions are forced to queue up during the read operation instead of during the write operation.All of this is what leads to increased throughput and decreased tail latency for contended operations.
Ah, OK. Thanks for this information.
I have updated this section to remove the bulleted list and rewrite things a bit. I tried to simplify the language slightly while still also hitting the points you mentioned. PTAL.
166e278
to
abf80ac
Compare
Also, thanks for the review Nathan! (Since I forgot to say that :-} ) |
Online preview: http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/abf80ac55e9aee2338d22b5e751d7331a4024de1/ Edited pages: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rmloveland)
_includes/v20.1/sql/select-for-update-overview.md, line 5 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
Interesting. Is that because the read-based queueing you describe below causes more transactions to "stack up" behind the one that holds the lock?
Exactly, this is called "pessimistic" locking, as opposed to "optimistic" locking. With SELECT FOR UPDATE, the reads grab locks and queue behind one another instead of executing concurrently. By itself, this would actually hurt performance. However, when performing an UPDATE of a row that was just read, we need to make sure that the row hasn't changed since the read. If we don't grab locks during the read and we're performing contended UPDATEs then there's a good chance that the row has changed and we'll need to restart the entire transaction (ie. we were optimistically hoping the row wouldn't change but doing nothing to prevent that, and it cost us). So SELECT FOR UPDATE is more aggressive (read: pessimistic) with lock, which has a small negative performance impact on its own but can prevent these full txn restarts, which hurt performance significantly more.
_includes/v20.1/sql/select-for-update-overview.md, line 7 at r1 (raw file):
read-modify-write workflow
Consider mentioning what this means in the context of SQL (e.g. a SELECT and then an UPDATE on that select).
Also, this should probably mention that the thrashing only happens with contended read-modify-write operations.
In addition,
The thrashing you refer to in the previous sentence is exactly what causes these retries, so this should be reworded a little bit.
v20.1/select-for-update.md, line 13 at r2 (raw file):
<div> {% include {{ page.version.version }}/sql/diagrams/simple_select_clause.html %}
Is this the right diagram to include here? It doesn't actually include any mention of "FOR UPDATE".
abf80ac
to
cb165b3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
_includes/v20.1/sql/select-for-update-overview.md, line 5 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Exactly, this is called "pessimistic" locking, as opposed to "optimistic" locking. With SELECT FOR UPDATE, the reads grab locks and queue behind one another instead of executing concurrently. By itself, this would actually hurt performance. However, when performing an UPDATE of a row that was just read, we need to make sure that the row hasn't changed since the read. If we don't grab locks during the read and we're performing contended UPDATEs then there's a good chance that the row has changed and we'll need to restart the entire transaction (ie. we were optimistically hoping the row wouldn't change but doing nothing to prevent that, and it cost us). So SELECT FOR UPDATE is more aggressive (read: pessimistic) with lock, which has a small negative performance impact on its own but can prevent these full txn restarts, which hurt performance significantly more.
Thanks for explaining this.
_includes/v20.1/sql/select-for-update-overview.md, line 7 at r1 (raw file):
Re: translating "read-modify-write" to SQL, and specifying that the thrashing only happens with contended read-modify-write operations, I updated that part to say the following:
Because this queueing happens during the read operation, the thrashing that would otherwise occur (when multiple concurrently executing transactions attempt to
SELECT
the same data and thenUPDATE
the results of that selection) is prevented.
Re: the thrashing causing the retries, I updated that sentence to say the following:
By preventing this thrashing, CockroachDB also prevents the [transaction retries][retries] that would otherwise occur.
PTAL.
v20.1/select-for-update.md, line 13 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is this the right diagram to include here? It doesn't actually include any mention of "FOR UPDATE".
It's not the right diagram. For now, I have deleted this section and diagram from this page (for now) and filed #6720 to get help figuring out how to generate the right diagram.
I notice that the CRDB grammar contains a for_locking_strength
section which has what we want (FOR UPDATE
), but I'm not sure how to get from there to a railroad diagram. The pkg/cmd/docgen
tool is something of a black box (to me); I'll try to figure this out / get help via that other issue.
Online preview: http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/cb165b38b0852f6dbd4b0f845f602c047b574569/ Edited pages: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 4 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
_includes/v20.1/sql/select-for-update-overview.md, line 7 at r1 (raw file):
Previously, rmloveland (Rich Loveland) wrote…
Re: translating "read-modify-write" to SQL, and specifying that the thrashing only happens with contended read-modify-write operations, I updated that part to say the following:
Because this queueing happens during the read operation, the thrashing that would otherwise occur (when multiple concurrently executing transactions attempt to
SELECT
the same data and thenUPDATE
the results of that selection) is prevented.Re: the thrashing causing the retries, I updated that sentence to say the following:
By preventing this thrashing, CockroachDB also prevents the [transaction retries][retries] that would otherwise occur.
PTAL.
LGTM now, though you might s/when/if/ and pull that out of the parens. If the UPDATEs aren't contended then it's perfectly fine to not use SELECT FOR UPDATE.
cb165b3
to
8d91d36
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
_includes/v20.1/sql/select-for-update-overview.md, line 7 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
LGTM now, though you might s/when/if/ and pull that out of the parens. If the UPDATEs aren't contended then it's perfectly fine to not use SELECT FOR UPDATE.
Done and done (s/when/if/
and paren removal). Thanks!
Online preview: http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/8d91d3650fa0486b8ffde5bdd6672dbf6b585d31/ Edited pages: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 4 files at r3, 1 of 1 files at r4.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @lnhsingh)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One language suggestion + clean up, but otherwise LGTM
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @rmloveland)
_includes/v20.1/sql/select-for-update-overview.md, line 3 at r4 (raw file):
<span class="version-tag">New in v20.1</span>: The `SELECT ... FOR UPDATE` statement is used to order transactions by controlling concurrent access to one or more rows of a table. It works by causing the rows returned by a [selection query][selection] to be locked, such that other transactions trying to access those rows are forced to wait for the transaction that locked the rows to finish. These other transactions are effectively put into a queue based on when they tried to read the value of the locked rows.
It works by causing the rows returned by a selection query to be locked
> It works by locking the rows returned by a selection query,
v20.1/performance-best-practices-overview.md, line 386 at r4 (raw file):
It works by causing the rows returned by a [selection query][selection] to be locked, such that other transactions trying to access those rows are forced to wait for the transaction that locked the rows to finish. These other transactions are effectively put into a queue based on when they tried to read the value of the locked rows.
I think this description should be the same as your select-for-update include above. Something like:
It works by locking the rows returned by a [selection query][selection], such that other transactions trying to access those rows are forced to wait for the transaction that locked the rows to finish. These other transactions are effectively put into a queue based on when they tried to read the value of the locked rows.
v20.1/select-for-update.md, line 14 at r4 (raw file):
The user must have the `SELECT` and `UPDATE` [privileges](authorization.html#assign-privileges) on the tables used as operands. ## Parameters
Should there also be a Synopsis section?
v20.1/selection-queries.md, line 25 at r4 (raw file):
<div> {% include {{ page.version.version }}/sql/diagrams/select.html %}
This is rendering with what looks like merge conflict messages - <<<<<<< HEAD
and ======
and >>>>>>> Recursive CTE docs
. I think there's also two diagrams showing when there should be one.
Fixes #6000. Summary of changes: - Add docs for the `SELECT ... FOR UPDATE` statement, including an example showing its intended use. - Update other pages re: selection queries to mention the use of `SELECT FOR UPDATE` to reduce retries and improve throughput/latency. - Update Transactions page and Perf Best Practices page to mention `SELECT FOR UPDATE` as a possible solution to thrashing/retry issues. - Add `SELECT FOR UPDATE` to SQL Statements and SQL Feature Support pages. - Add docs for new `enable_implicit_select_for_update` session and cluster setting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Lauren!
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @lnhsingh)
_includes/v20.1/sql/select-for-update-overview.md, line 3 at r4 (raw file):
Previously, lnhsingh (Lauren Hirata Singh) wrote…
It works by causing the rows returned by a selection query to be locked
>It works by locking the rows returned by a selection query,
Updated! Much better, thanks.
v20.1/performance-best-practices-overview.md, line 386 at r4 (raw file):
Previously, lnhsingh (Lauren Hirata Singh) wrote…
It works by causing the rows returned by a [selection query][selection] to be locked, such that other transactions trying to access those rows are forced to wait for the transaction that locked the rows to finish. These other transactions are effectively put into a queue based on when they tried to read the value of the locked rows.
I think this description should be the same as your select-for-update include above. Something like:
It works by locking the rows returned by a [selection query][selection], such that other transactions trying to access those rows are forced to wait for the transaction that locked the rows to finish. These other transactions are effectively put into a queue based on when they tried to read the value of the locked rows.
Fixed.
v20.1/select-for-update.md, line 14 at r4 (raw file):
Previously, lnhsingh (Lauren Hirata Singh) wrote…
Should there also be a Synopsis section?
Yes! Unfortunately, I have not been able to figure out how to generate a diagram correctly for this variant of SELECT. Filed #6720 - details about the pain are listed there.
v20.1/selection-queries.md, line 25 at r4 (raw file):
Previously, lnhsingh (Lauren Hirata Singh) wrote…
This is rendering with what looks like merge conflict messages -
<<<<<<< HEAD
and======
and>>>>>>> Recursive CTE docs
. I think there's also two diagrams showing when there should be one.
Weird! I am not seeing the merge conflict markers in my local repo. I just rebased everything onto latest master
with no conflicts. Will check to see if the conflict markers are there after I push this update.
8d91d36
to
978cf68
Compare
Online preview: http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/978cf68494785809963e2ee5d59946f1160bb861/ Edited pages: |
This IS there, my bad. I went and looked at the include. It was introduced in a different PR :-/ I will make the fix in a separate PR since I want to drive some discussion of one of my pet issues #2836 |
Fixes #6000.
Summary of changes:
Add docs for the
SELECT ... FOR UPDATE
statement, including anexample showing its intended use.
Update other pages re: selection queries to mention the use of
SELECT FOR UPDATE
to reduce contention/retries and improvethroughput/latency.
Update Transactions page and Perf Best Practices page to mention
SELECT FOR UPDATE
as a possible solution to contention/retry issues.Add
SELECT FOR UPDATE
to SQL Statements and SQL Feature Supportpages.
Add docs for new
enable_implicit_select_for_update
session andcluster setting.